NYU: Description of the MENE Named Entity System as Used in MUC-7

Authors

  • Andrew Borthwick
  • John Sterling
  • Eugene Agichtein
  • Ralph Grishman
Abstract

This paper describes a new system called "Maximum Entropy Named Entity" or "MENE" (pronounced "meanie") which was NYU's entrant in the MUC-7 named entity evaluation. By working within the framework of maximum entropy theory and utilizing a flexible object-based architecture, the system is able to make use of an extraordinarily diverse range of knowledge sources in making its tagging decisions. These knowledge sources include capitalization features, lexical features and features indicating the current type of text (i.e. headline or main body). It makes use of a broad array of dictionaries of useful single or multi-word terms such as first names, company names, and corporate suffixes. These dictionaries required no manual editing and were either downloaded from the web or were simply "obvious" lists entered by hand.
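As a rough illustration of how a maximum entropy tagger can combine the kinds of knowledge sources named in the abstract, the sketch below defines binary features (capitalization, lexical, text zone, dictionary lookups) and turns their weights into a conditional distribution over tags. Everything specific here is an assumption made for illustration: the three-tag set, the tiny dictionaries, the feature names, and the hand-picked weights are hypothetical and are not taken from the MENE system, which would estimate its weights from training data.

```python
import math

# Hypothetical, simplified tag set (not MENE's actual tag inventory).
TAGS = ["person", "organization", "other"]

# Tiny stand-ins for the dictionaries mentioned in the abstract.
FIRST_NAMES = {"john", "andrew"}
CORP_SUFFIXES = {"inc.", "corp.", "ltd."}


def active_features(tokens, i, in_headline):
    """Return the names of the binary features that fire for tokens[i]."""
    tok = tokens[i]
    feats = ["lex=" + tok.lower()]           # lexical feature
    if tok[:1].isupper():
        feats.append("init_cap")             # capitalization feature
    if in_headline:
        feats.append("zone=headline")        # text-type feature
    if tok.lower() in FIRST_NAMES:
        feats.append("dict=first_name")      # dictionary feature
    if i + 1 < len(tokens) and tokens[i + 1].lower() in CORP_SUFFIXES:
        feats.append("next=corp_suffix")     # dictionary feature on next token
    return feats


# Hand-picked illustrative weights for (feature, tag) pairs; assumed values.
WEIGHTS = {
    ("init_cap", "person"): 1.0,
    ("init_cap", "organization"): 1.0,
    ("dict=first_name", "person"): 2.0,
    ("next=corp_suffix", "organization"): 3.0,
    ("zone=headline", "other"): 0.5,
}


def tag_distribution(tokens, i, in_headline=False):
    """Conditional max-ent form: p(tag | context) is proportional to
    exp(sum of weights of the active (feature, tag) pairs)."""
    feats = active_features(tokens, i, in_headline)
    scores = {
        tag: math.exp(sum(WEIGHTS.get((f, tag), 0.0) for f in feats))
        for tag in TAGS
    }
    z = sum(scores.values())  # normalizer
    return {tag: s / z for tag, s in scores.items()}


if __name__ == "__main__":
    sentence = ["John", "joined", "Acme", "Inc."]
    for idx, word in enumerate(sentence):
        print(word, tag_distribution(sentence, idx))
```

Because each knowledge source only contributes feature names, adding a new dictionary or cue in this sketch means adding one more check in `active_features` and leaving the probability computation untouched.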

Similar resources

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

Description of the Oki System as Used for MUC-7

This paper describes the Oki Information Extraction system as used for MUC-7 evaluation [1][2]. The tasks we have conducted are Named Entity, Co-reference, Template Element and Template Relation. Each module is implemented using MT system modules and pattern recognition modules. Our purposes to participate MUC-7 evaluation are to evaluate how MT system modules are effective for other application ...

Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition

This paper describes a novel statistical named-entity (i.e. "proper name") recognition system built around a maximum entropy framework. By working within the framework of maximum entropy theory and utilizing a flexible object-based architecture, the system is able to make use of an extraordinarily diverse range of knowledge sources in making its tagging decisions. These knowledge sources include...

IsoQuest Inc.: Description Of The NetOwl (TM) Extractor System As Used For MUC-7

IsoQuest used its commercial software product, NetOwl Extractor, for the MUC-7 Named Entity task. The product consists of a high-speed C engine that analyzes text based on a configuration file containing a pattern rule base and lexicon. IsoQuest used the NameTag Configuration to recognize proper names and other key phrases in text, and mapped the product’s extraction tags to the MUC-7 NE tags. ...

NYU: Description of the Proteus/PET System as Used for MUC-7 ST

Through the history of the MUC's, adapting Information Extraction (IE) systems to a new class of events has continued to be a time-consuming and expensive task. Since MUC-6, the Information Extraction effort at NYU has focused on the problem of portability and customization, especially at the scenario level. To begin to address this problem, we have built a set of tools, which allow the user to ...

Publication date: 1998